Unpod Voice Infrastructure

How it works

The unpod SDK connects your AgentRunner to Unpod's voice platform over WebSocket. Unpod handles speech-to-text (STT), text-to-speech (TTS), telephony, phone numbers, and recording; your code handles the dialog logic using a SuperDialog DialogMachine or LLMAgent.

Caller → Phone number (Unpod) → STT → AgentRunner → DialogMachine → TTS → Caller

Your dialog machine runs inside your own process, alongside the rest of your agent logic, so no separate WebSocket server is needed.
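To make the data flow concrete, here is a toy model of a single call turn written with plain Python stand-ins. None of these names are Unpod or SuperDialog APIs: `fake_stt`, `fake_tts`, and `EchoDialog` are hypothetical placeholders that only show which stage hands what to the next.

```python
from dataclasses import dataclass, field


def fake_stt(audio: bytes) -> str:
    """Stand-in for speech-to-text: here the 'audio' is just UTF-8 text."""
    return audio.decode("utf-8")


def fake_tts(text: str) -> bytes:
    """Stand-in for text-to-speech: returns the reply encoded as bytes."""
    return text.encode("utf-8")


@dataclass
class EchoDialog:
    """Hypothetical dialog logic; the only stage that would live in your code."""
    history: list[str] = field(default_factory=list)

    def reply(self, user_text: str) -> str:
        self.history.append(user_text)
        return f"You said: {user_text}"


def handle_turn(dialog: EchoDialog, caller_audio: bytes) -> bytes:
    """Caller audio -> STT -> dialog logic -> TTS -> audio back to the caller."""
    user_text = fake_stt(caller_audio)
    agent_text = dialog.reply(user_text)
    return fake_tts(agent_text)


dialog = EchoDialog()
audio_out = handle_turn(dialog, b"hello")
print(audio_out.decode("utf-8"))
```

In the real platform the first and last stages run on Unpod's side, and only the middle stage runs in your process.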

Step 1 — Build your dialog machine

from superdialog import DialogMachine, Flow, PythonTool

# `crm` stands for your own CRM client, defined elsewhere in your application.
def lookup_customer(phone: str) -> dict:
    """Look up customer by phone number."""
    return crm.get_by_phone(phone)

dialog_machine = DialogMachine(
    flow=Flow.load("kyc.json"),
    llm="anthropic/claude-haiku-4-5",
    tools=[PythonTool.of(lookup_customer)],
)

See SuperDialog Quickstart to generate and save a flow file.
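The Step 1 snippet relies on a `crm` object that it does not define. For local testing, a minimal in-memory stand-in might look like the sketch below. `InMemoryCRM`, its digit-only normalization, and the sample record are all hypothetical and not part of SuperDialog or Unpod.

```python
class InMemoryCRM:
    """Hypothetical stand-in for a real CRM client, for local testing only."""

    def __init__(self, records: dict[str, dict]) -> None:
        self._records = records

    def get_by_phone(self, phone: str) -> dict:
        # Keep only digits so "+91 98765-43210" matches the key "919876543210".
        key = "".join(ch for ch in phone if ch.isdigit())
        # Return an explicit "not found" payload instead of raising,
        # so the tool result always has a predictable shape.
        return self._records.get(key, {"found": False})


crm = InMemoryCRM(
    {"919876543210": {"found": True, "name": "Asha", "kyc_status": "pending"}}
)


def lookup_customer(phone: str) -> dict:
    """Look up customer by phone number."""
    return crm.get_by_phone(phone)


print(lookup_customer("+91 98765-43210"))
```

Returning a dict in both the hit and miss cases is a design choice for this sketch; it keeps the tool's output easy for the model to interpret.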

Step 2 — Plug the machine into your session

Assign dialog_machine to ctx.session.dialog_machine inside your AgentRunner entrypoint. The SDK auto-wraps it — no adapter import needed.

from unpod import AgentRunner, CallContext
from superdialog import DialogMachine, Flow, PythonTool

def lookup_customer(phone: str) -> dict:
    """Look up customer by phone number."""
    return crm.get_by_phone(phone)

async def handle_call(ctx: CallContext) -> None:
    # Build (or re-use) the machine per call
    machine = DialogMachine(
        flow=Flow.load("kyc.json"),
        llm="anthropic/claude-haiku-4-5",
        tools=[PythonTool.of(lookup_customer)],
    )
    ctx.session.dialog_machine = machine   # auto-wrapped by SuperDialogAdapter
    await ctx.session.run()                # blocks until the call ends

AgentRunner(
    entrypoint=handle_call,
    agent_id="kyc-bot",
).start()

Step 3 — Create and register your Speech Pipe

import asyncio
from unpod import AsyncClient

async def setup():
    async with AsyncClient() as client:
        profiles = await client.voice_profiles.list(language="en")
        pipe = await client.pipes.create(
            name="KYC Bot",
            voice_profile=profiles[0].profile_id,
            agent_id="kyc-bot",    # must match AgentRunner(agent_id=...)
            recording=True,
        )

        numbers = await client.numbers.list()
        if numbers:
            await client.numbers.attach(numbers[0].id, pipe_id=pipe.pipe_id)

        print(f"Speech Pipe: {pipe.pipe_id}, Number: {numbers[0].number if numbers else 'none'}")

asyncio.run(setup())
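The setup script indexes `profiles[0]` directly, which raises a bare `IndexError` if the list call returns nothing. One way to fail with a clearer message is a small helper like the one below. It is plain Python and assumes only that the results behave like sequences, as they do in the snippet above; the name `first_or_fail` is made up for this sketch.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def first_or_fail(items: Sequence[T], what: str) -> T:
    """Return the first item, or raise a descriptive error if the sequence is empty."""
    if not items:
        raise LookupError(f"No {what} available; create one before running setup.")
    return items[0]


# Placeholder data standing in for API results.
profiles = ["profile-en-1", "profile-en-2"]
print(first_or_fail(profiles, "English voice profile"))

try:
    first_or_fail([], "phone number")
except LookupError as exc:
    print(exc)
```

In `setup()`, this would wrap the result of `client.voice_profiles.list(...)` before the pipe is created.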

Complete example

import asyncio
from superdialog import DialogMachine, Flow, PythonTool
from unpod import AgentRunner, AsyncClient, CallContext

# --- One-time setup (run once) ---

async def setup():
    async with AsyncClient() as client:
        profiles = await client.voice_profiles.list(language="en")
        pipe = await client.pipes.create(
            name="KYC Bot",
            voice_profile=profiles[0].profile_id,
            agent_id="kyc-bot",
            recording=True,
        )
        numbers = await client.numbers.list()
        if numbers:
            await client.numbers.attach(numbers[0].id, pipe_id=pipe.pipe_id)

# asyncio.run(setup())   # Run once to provision

# --- Tools (kyc.json itself is built with the superdialog CLI or create_dialog_flow) ---

# `crm` stands for your own CRM client, defined elsewhere in your application.
def lookup_aadhaar(partial: str) -> dict:
    """Look up customer by partial Aadhaar."""
    return crm.lookup_by_partial_aadhaar(partial)

# --- Runner (long-lived process) ---

async def handle_call(ctx: CallContext) -> None:
    machine = DialogMachine(
        flow=Flow.load("kyc.json"),
        llm="anthropic/claude-haiku-4-5",
        tools=[PythonTool.of(lookup_aadhaar)],
    )
    ctx.session.dialog_machine = machine
    await ctx.session.run()

AgentRunner(
    entrypoint=handle_call,
    agent_id="kyc-bot",
).start()
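The complete example keeps provisioning behind a commented-out line. One possible alternative is to choose the mode from the command line so the file never needs editing. The sketch below uses only the standard library `argparse`; the `setup` and `run` mode names and the returned strings are illustrative assumptions, and the comments mark where the real calls from the example would go.

```python
import argparse


def parse_mode(argv: list[str]) -> str:
    """Parse an optional positional mode; defaults to "run"."""
    parser = argparse.ArgumentParser(description="KYC bot entry point")
    parser.add_argument("mode", choices=["setup", "run"], nargs="?", default="run")
    return parser.parse_args(argv).mode


def main(argv: list[str]) -> str:
    mode = parse_mode(argv)
    if mode == "setup":
        # Here you would call: asyncio.run(setup())
        return "provisioning"
    # Here you would call: AgentRunner(entrypoint=handle_call, agent_id="kyc-bot").start()
    return "serving calls"


print(main(["setup"]))
print(main([]))
```

In a real script, `main(sys.argv[1:])` would be called under an `if __name__ == "__main__":` guard.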

Using pre-call data in the flow

Data passed when triggering an outbound call (or injected by the platform) is available on ctx.data:

async def handle_call(ctx: CallContext) -> None:
    machine = DialogMachine(
        flow=Flow.load("onboarding.json"),
        llm="anthropic/claude-haiku-4-5",
    )
    # Inject caller context before the first turn
    if customer_name := ctx.data.get("customer_name"):
        machine.assist(f"The customer's name is {customer_name}. Address them by name.")

    ctx.session.dialog_machine = machine
    await ctx.session.run()
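When several pre-call fields matter, it can help to build the instruction strings in one pure function and then pass each one to `machine.assist(...)`. The sketch below works on a plain dict. Only `customer_name` comes from the example above; the fields `preferred_language` and `account_tier` are hypothetical.

```python
def build_precall_instructions(data: dict) -> list[str]:
    """Turn pre-call data into instruction strings, skipping missing or empty fields."""
    instructions: list[str] = []
    if name := data.get("customer_name"):
        instructions.append(f"The customer's name is {name}. Address them by name.")
    # The two fields below are hypothetical examples of extra context.
    if language := data.get("preferred_language"):
        instructions.append(f"The customer prefers to speak {language}.")
    if tier := data.get("account_tier"):
        instructions.append(f"The customer is on the {tier} plan.")
    return instructions


for line in build_precall_instructions(
    {"customer_name": "Asha", "account_tier": "premium"}
):
    print(line)
```

Inside `handle_call`, the loop body would become `machine.assist(line)`, with `ctx.data` passed in place of the literal dict. Keeping the function free of SDK objects makes it easy to unit test.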

Mid-call context injection

Inject system instructions at any point during an active call from your own business logic:

async def handle_call(ctx: CallContext) -> None:
    machine = DialogMachine(flow=Flow.load("support.json"), llm="openai/gpt-4o-mini")

    @ctx.session.on("user_turn")
    async def _(text: str) -> None:
        sentiment = await analyze_sentiment(text)
        if sentiment == "frustrated":
            machine.assist("The customer seems frustrated. Be empathetic and offer escalation.")

    ctx.session.dialog_machine = machine
    await ctx.session.run()
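`analyze_sentiment` is not defined in the snippet above and is left to your application. As a placeholder while wiring things up, a keyword-based version could look like the sketch below. The marker phrases and the `"neutral"` label are assumptions for illustration; a production system would more likely call a classifier or an LLM.

```python
import asyncio

# Hypothetical marker phrases; tune these or replace them with a real model.
FRUSTRATION_MARKERS = ("ridiculous", "useless", "third time", "fed up", "cancel my")


async def analyze_sentiment(text: str) -> str:
    """Return "frustrated" if any marker phrase appears, else "neutral"."""
    lowered = text.lower()
    if any(marker in lowered for marker in FRUSTRATION_MARKERS):
        return "frustrated"
    return "neutral"


print(asyncio.run(analyze_sentiment("This is the third time I'm calling!")))
print(asyncio.run(analyze_sentiment("I'd like to update my address.")))
```

The function is declared `async` so it matches the `await analyze_sentiment(text)` call in the handler and can later be swapped for a network-backed implementation without changing the call site.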

Switching flows mid-call

Replace the active flow during a call with machine.switch_flow():

async def handle_call(ctx: CallContext) -> None:
    machine = DialogMachine(flow=Flow.load("triage.json"), llm="openai/gpt-4o-mini")

    @ctx.session.on("user_turn")
    async def _(text: str) -> None:
        if "billing" in text.lower():
            machine.switch_flow(Flow.load("billing.json"), preserve_memory=True)

    ctx.session.dialog_machine = machine
    await ctx.session.run()
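The handler above calls `switch_flow` on every turn that mentions billing, including turns that arrive after the switch has already happened. A small router that remembers the active flow is one way to avoid repeated switches. This sketch is plain Python; `FlowRouter` and its keyword-to-file mapping are hypothetical, and the returned file name would be passed to `Flow.load(...)` and `machine.switch_flow(...)` as in the example above.

```python
from typing import Optional


class FlowRouter:
    """Pick a flow file from keywords in the user's turn, skipping no-op switches."""

    def __init__(self, routes: dict[str, str], initial: str) -> None:
        self._routes = routes   # keyword -> flow file name
        self.current = initial  # flow file currently active

    def route(self, text: str) -> Optional[str]:
        """Return the flow file to switch to, or None if no switch is needed."""
        lowered = text.lower()
        for keyword, flow_file in self._routes.items():
            if keyword in lowered and flow_file != self.current:
                self.current = flow_file
                return flow_file
        return None


router = FlowRouter(
    {"billing": "billing.json", "refund": "billing.json"},
    initial="triage.json",
)
print(router.route("I have a billing question"))
print(router.route("Still about billing"))
```

The first call returns the new flow file; the second returns `None` because that flow is already active.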

vs. LiveKit / PipeCat adapters

	Unpod Voice (SDK)	LiveKit adapter	PipeCat adapter
Who handles STT/TTS	Unpod	You (via LiveKit plugins)	You (via PipeCat services)
Who handles telephony	Unpod	You / LiveKit SIP	You / Twilio / etc.
Dialog runs in	Your AgentRunner process	Your LiveKit agent	Your PipeCat pipeline
Best for	Fastest path to production voice	Full media layer control	Existing PipeCat pipelines

Next Steps

SDK Setup: AgentRunner constructor, credentials, and lifecycle.

Session Controls: say(), transfer(), recording controls during live calls.

SuperDialog Flows: build and save conversation flows.

SuperDialog Tools: add Python, HTTP, and MCP tools to your machine.
